RDF/XML Streaming Parser
A fast, streaming RDF/XML parser
that outputs RDFJS-compliant quads.
Installation
$ yarn add rdfxml-streaming-parser
This package also works out-of-the-box in browsers via tools such as webpack and browserify.
Require
import {RdfXmlParser} from "rdfxml-streaming-parser";
or
const RdfXmlParser = require("rdfxml-streaming-parser").RdfXmlParser;
Usage
RdfXmlParser
is a Node Transform stream
that takes in chunks of RDF/XML data,
and outputs RDFJS-compliant quads.
You can pipe streams into it,
or write strings to it directly.
Print all parsed triples from a file to the console
const fs = require('fs');

const myParser = new RdfXmlParser();
fs.createReadStream('myfile.rdf')
  .pipe(myParser)
  .on('data', console.log)
  .on('error', console.error)
  .on('end', () => console.log('All triples were parsed!'));
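Note that .pipe() does not forward errors from the file stream to the parser, so a read failure (for example, a missing file) would not reach the 'error' handler in the example above. The following is a minimal sketch of one way to cover both cases, assuming Node's built-in stream.pipeline and assuming the parser behaves like any other Transform stream in a pipeline. collectQuads is a hypothetical helper name and is not part of this package.

```javascript
const { pipeline, Writable } = require('stream');

// Hypothetical helper (not part of rdfxml-streaming-parser):
// pipes `source` through `parser` and resolves with every emitted quad.
// Rejects if the source, the parser, or the sink reports an error.
function collectQuads(source, parser) {
  return new Promise((resolve, reject) => {
    const quads = [];

    // Object-mode sink that buffers each quad in memory.
    const sink = new Writable({
      objectMode: true,
      write(quad, _encoding, callback) {
        quads.push(quad);
        callback();
      },
    });

    // pipeline() destroys all streams and reports the first error from any of them.
    pipeline(source, parser, sink, (error) => {
      if (error) {
        reject(error);
      } else {
        resolve(quads);
      }
    });
  });
}
```

A call such as collectQuads(fs.createReadStream('myfile.rdf'), new RdfXmlParser()) would then return a promise for the parsed quads. Because all quads are held in memory, this approach is only suitable for documents of modest size.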
Manually write strings to the parser
const myParser = new RdfXmlParser();
myParser
.on('data', console.log)
.on('error', console.error)
.on('end', () => console.log('All triples were parsed!'));
myParser.write('<?xml version="1.0"?>');
myParser.write(`<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
xmlns:ex="http://example.org/stuff/1.0/"
xml:base="http://example.org/triples/">`);
myParser.write(`<rdf:Description rdf:about="http://www.w3.org/TR/rdf-syntax-grammar">`);
myParser.write(`<ex:prop />`);
myParser.write(`</rdf:Description>`);
myParser.write(`</rdf:RDF>`);
myParser.end();
Import streams
This parser implements the RDFJS Sink interface,
which makes it possible to alternatively parse streams using the import
method.
const fs = require('fs');

const myParser = new RdfXmlParser();
const myTextStream = fs.createReadStream('myfile.rdf');
myParser.import(myTextStream)
  .on('data', console.log)
  .on('error', console.error)
  .on('end', () => console.log('All triples were parsed!'));
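The import method expects a stream, so an RDF/XML document that is already in memory has to be wrapped in one first. A minimal sketch, assuming Node's Readable.from is an acceptable input for import and using only the 'data', 'error', and 'end' events shown above. importFromString is a hypothetical helper name, not part of this package.

```javascript
const { Readable } = require('stream');

// Hypothetical helper (not part of rdfxml-streaming-parser):
// wraps `text` in a one-chunk stream, hands it to `sink.import`,
// and resolves with all quads emitted by the returned stream.
function importFromString(sink, text) {
  return new Promise((resolve, reject) => {
    const quads = [];
    sink.import(Readable.from([text]))
      .on('data', (quad) => quads.push(quad))
      .on('error', reject)
      .on('end', () => resolve(quads));
  });
}
```

With this helper, importFromString(new RdfXmlParser(), rdfXmlText) would return a promise for the quads, where rdfXmlText is a placeholder for your own document string.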
Configuration
Optionally, the following parameters can be set in the RdfXmlParser constructor:
dataFactory: A custom RDFJS DataFactory to construct terms and triples. (Default: require('@rdfjs/data-model'))
baseIRI: An initial default base IRI. (Default: '')
defaultGraph: The default graph for constructing quads. (Default: defaultGraph())
strict: Whether the internal SAX parser should parse XML in strict mode and emit an error if the XML is invalid. (Default: false)
trackPosition: Whether the internal position (line, column) should be tracked and emitted in error messages. (Default: false)
allowDuplicateRdfIds: By default, multiple occurrences of the same rdf:ID value are not allowed. Setting this option to true disables this uniqueness check. (Default: false)
validateUri: Whether the parser validates each URI. (Default: true)
iriValidationStrategy: The IRI validation strategy to use, given as a value of the IriValidationStrategy enumeration. IRI validation is handled by validate-iri.js. (Default: IriValidationStrategy.Pragmatic)
const dataFactory = require('@rdfjs/data-model');

new RdfXmlParser({
  dataFactory,
  baseIRI: 'http://example.org/',
  defaultGraph: dataFactory.namedNode('http://example.org/graph'),
  strict: true,
  trackPosition: true,
  allowDuplicateRdfIds: true,
  validateUri: true,
});
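Since the options are passed as a plain object, JavaScript itself does not flag a misspelled option name. The sketch below shows one possible guard: it merges overrides onto the scalar defaults documented in the option list above and throws on unknown names. buildParserOptions, DOCUMENTED_DEFAULTS, and OTHER_OPTIONS are hypothetical names introduced for illustration and are not part of this package.

```javascript
// Defaults for the scalar options, copied from the option list in this README.
const DOCUMENTED_DEFAULTS = Object.freeze({
  baseIRI: '',
  strict: false,
  trackPosition: false,
  allowDuplicateRdfIds: false,
  validateUri: true,
});

// Options whose defaults are objects or enum values; they are accepted
// by name but left unset so the parser applies its own defaults.
const OTHER_OPTIONS = ['dataFactory', 'defaultGraph', 'iriValidationStrategy'];

// Hypothetical helper: returns an options object for the RdfXmlParser
// constructor, throwing a TypeError when an option name is not recognized.
function buildParserOptions(overrides = {}) {
  const known = new Set([...Object.keys(DOCUMENTED_DEFAULTS), ...OTHER_OPTIONS]);
  for (const name of Object.keys(overrides)) {
    if (!known.has(name)) {
      throw new TypeError(`Unknown RdfXmlParser option: ${name}`);
    }
  }
  return { ...DOCUMENTED_DEFAULTS, ...overrides };
}
```

The result can be passed straight to the constructor, for example new RdfXmlParser(buildParserOptions({ strict: true })).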
License
This software is written by Ruben Taelman.
This code is released under the MIT license.